This is the notebook for the Module 3 Programming Assignment.
Here are a few tips for using the iPython HTML notebook:
You should rename this notebook to be <your JHED id>_PR3.ipynb Do it right now.
Make certain the entire notebook executes before you submit it. (See Hint #5 above)
Change the following variables:
In [1]:
name = "Shehzad Qureshi"
jhed_id = "squresh6"
if name == "Student Name" or jhed_id == "sname1":
raise Exception( "Change the name and/or JHED ID...preferrably to yours.")
Add whatever additional imports you require here. Stick with the standard libraries and those required by the class. The import gives you access to these functions: http://ipython.org/ipython-doc/stable/api/generated/IPython.core.display.html (Copy this link) Which, among other things, will permit you to display HTML as the result of evaluated code (see HTML() or display_html()).
In [2]:
import networkx as nx
from copy import deepcopy
from operator import itemgetter
import matplotlib.pyplot as plt
%matplotlib inline
In this programming assignment, you will be using your new understanding Constraint Satisfaction Problems to color random maps. As we know from the Four Color Theorem any division of a plane into contiguous regions can be colored such that no two adjacent regions are the same color by using only four colors.
From the book, we know that we can translate this into a CSP where the map is represented as a planar graph where the goal is to color all the nodes such that no adjacent nodes are colored the same color.
def color_map( world)
where world is a Tuple of (V,E) where V is a vertices list [n1, n2, n3, n4, ...] and E is an edge list [(n1, n3), (n4, n7), ...]. You may want to convert the Edge list to an Adjacency list inside color_map() or not. It's up to you. Each $n_i$ can be a string ("France") or a node number (7).
color_map should return a Dict {n1: "color1", n2: "color2", ...}
There are two possibilies:
If your computational geometry is up to snuff, you can generate random maps. These are essentially Delauney Triangulations of randomly generated points (where points are the centers of the regions in the map). Color maps of varying sizes (10 nodes, 50 nodes, 100 nodes).
You can convert real maps into planar graphs. For your testing, you should use the Counties of Connecticut. You then must color in the Countries of Europe. Do not confuse regions with countries. For example, Northen Ireland should be colored the same as the United Kingdom (they are together Great Britain). Additionally, neighboring islands that are different countries, shouldn't be colored the same.
Side question...do these maps require four colors?
You will need to implement backtracking and forward checking. You will also need to implement Minimum Remaining Values or Degree Heuristic to pick variables (tell me which one) and Least Constraining Value to pick values. Break ties in ascending order (least to most).
I suggest you implement backtracking then forward checking then either minimum remaining values or degree heuristic then least constraining value. This maximizes partial credit should you be unable to complete everything. Only submit a working progam (if you make it through backtracking and forward checking but don't get to the others, remove their code). Please change the "?" below into "yes" or "no" indicating which elements you were able to implement:
backtracking: yes forward checking: yes minimum remaining values: yes degree heuristic: yes (optional) least contraining value: yes
Your code should print out traces (or debugging) of what it is doing so that I can see the operation of the algorithm.
You can put your helper functions and markdown documentation in cells below here. As before, your Markdown documentation should indicate both what the function does and explains the concept behind the algorithm (or contribution to the algorithm) that is being implemented (where appropriate...not all helper functions will represent a concept in the algorithm). For example, wherever and however backtracking is implemented, the important and role of backtracking in the algorithm should be explained. Similarly everything else.
Your code should print out traces (or debugging) of what it is doing so that we can see the operation of the algorithm (Note that because of the way that IPython works, this will slow things down a bit). You're more than welcome to develop in a regular IDE, paste functions into the notebook and add Markdown commentary later...just make sure you do it).
You can put your helper functions and markdown documentation in cells below here:
Representing a Constraint Satisfaction Problem
A constraint satisfaction problem (CSP) consists of a set of variables (V), a set of domains (D) for each variable containing a set of values assignable to the variable, and a set of constraints (C). CSP = (V, D, C).
Variables will be represented as a simple list. The type of variable should be irrevelant, be it strings or integers or objects.
Domains will be represented by a dictionary. Each key in the dictionary shall be a variable and the values of each key shall be the domain of that variable.
Constraints will be represented by an adjacency list in the form of a dictionary. Each key shall be a variable in the problem and the values for each key shall be a set of neighbor nodes. We will use a set to eliminate the possibility of duplicates in the event the input is a digraph.
We will now write helper functions to assist with a basic backtracking-search algorithm, adapted from Artificial Intelligence: A Modern Approach.
function check_completeness
We will need to check if the current solution is complete. This is as simple as checking if the solution has assigned values to every variable in the problem.
In [3]:
def check_completeness(solution, variables):
return True if len(solution) == len(variables) else False
function static_variable_selection
For our variable selection heuristic, this will be a simple function that returns the first unassigned variable in the list.
In [4]:
def static_variable_selection(solution, csp, debug=True):
unassigned_variables = [x for x in csp[0] if x not in solution]
variable = unassigned_variables[0]
if debug: print " Selected variable: {0}".format(variable)
return variable
function static_domain_selection
For our domain-value selection heurisitc, this will be a simple function that returns the value in a static ordering, e.g. (1, 2, 3, 4).
In [5]:
def static_domain_selection(variable, assignment, csp, debug=True):
for domain_value in csp[1][variable]:
if debug: print " Selected value {0} for {1}".format(domain_value, variable)
yield domain_value
function check_value_consistency
We will need to check if assigning a value to a variable will conflict with our current assignments.
In [6]:
def check_value_consistency(variable, value, assignment, constraints, debug=True):
if assignment is None: return False
for v in constraints:
if v in assignment:
if value is assignment[v]:
if debug: print " {0} : {1} conflicts with {2} : {3}".format(variable, value, v, assignment[v])
return False
return True
function no_inference
A future template for checking inference in the search. This function does nothing.
In [7]:
def no_inference(csp, variable, value, assignment, debug=True):
return True
function backtrack
Our recurse backtrack function. Assignment is an empty dictionary which will hold the current solution. csp is a CSP tuple (V, D, C). params are the heurisitcs, which are configurable by the caller.
The algorithm works by using a depth-first search to assign values to variables based on certain heuristics. If a value cannot be assigned, it backtracks to the previous value assignment and tries a different combination. It does this recursively until a solution is found, if there is one.
In [8]:
def backtrack(assignment, csp, params, debug=True):
select_unassigned_variable, order_domain_values, inference = params
if debug: print "Solution: {0}".format(assignment)
if check_completeness(assignment, csp[0]): return True, assignment
variable = select_unassigned_variable(assignment, csp, debug=debug)
for value in order_domain_values(variable, assignment, csp, debug):
if check_value_consistency(variable, value, assignment, csp[2][variable], debug=debug):
assignment[variable], m_csp = value, deepcopy(csp)
if inference(csp, variable, value, assignment, debug=debug):
result, assignment = backtrack(deepcopy(assignment), csp, params, debug=debug)
if result: return True, assignment
del assignment[variable]
csp = m_csp
return False, assignment
function backtracking_search
This is the main entry point for the search algorithm. params is changeable depending on what heuristics we want to implement.
In [9]:
def backtracking_search(csp, params, debug=True):
solution = dict()
return backtrack(solution, csp, params, debug=debug)
function get_adjacency_list
We need a way of converting nodes and edges provided into an adjacency list. This function loops through the list of edges and gathers the unique neighboring nodes for each node and stores it in a list. Each list is then added to a dictionary which holds the adjacency list for the CSP.
In [10]:
def get_adjacency_list(nodes, edges):
adjacency_list = dict()
for node in nodes:
aList = set()
[aList.add(x) for x, y in edges if x != node and y == node]
[aList.add(y) for x, y in edges if y != node and x == node]
adjacency_list[node] = aList
return adjacency_list
We will need placeholders for future functions, blame IPython for its iterative approach.
In [11]:
def minimum_remaining_values():
pass
def least_contraining_value():
pass
def forward_checking():
pass
function color_map
This is the public function that will color a map given the nodes and edges as a tuple. Colors is changeable for testing purposes. The default heuristics are Minimum Remaining Values, Least Constraining Value and Forward Checking. Debugging is turned on by default as well which will provide traces as the algorithm progresses.
In [12]:
def color_map(world, colors=["blue", "green", "red", "yellow"], params=(minimum_remaining_values, least_contraining_value, forward_checking), debug=True):
variables = world[0]
domains = dict([(x, colors) for x in variables])
constraints = get_adjacency_list(world[0], world[1])
result, solution = backtracking_search((variables, domains, constraints), params, debug=debug)
if not result: print "No Solution"
else: return solution
function draw_color_map
We need an easy way of visually checking for correctness. We can use NetworkX to plot a graph and color the nodes as appropriate. Laying the nodes out in a geographically correct manner will be left for later if time permits.
In [13]:
def draw_color_map(world, colors, pos=None):
if colors is None: return
fig = plt.figure()
G = nx.Graph()
if pos is None:
pos = nx.spring_layout(G)
G.add_nodes_from(world[0])
G.add_edges_from(world[1])
node_colors = [colors[x] for x in G.nodes()]
nx.draw(G, node_color=node_colors, node_size=700, alpha=0.5)
plt.axis('off')
fig.set_size_inches(8, 8)
plt.show()
Testing
First we will test with the counties of Connecticut. This function was re-used from the link provided in the discussion forums.
In [14]:
def connecticut_map():
nodes = ["fairfield", "litchfield", "new haven", "hartford", "middlesex", "tolland", "new london", "windham"]
edges = [("fairfield", "litchfield"), ("fairfield", "new haven"),
("litchfield", "new haven"), ("litchfield", "hartford"),
("new haven", "hartford"), ("new haven", "middlesex"),
("hartford", "middlesex"), ("hartford", "tolland"), ("hartford", "new london"),
("middlesex", "new london"),
("tolland", "new london"), ("tolland", "windham"),
("new london", "windham")]
return (nodes, edges)
In [15]:
world = connecticut_map()
solution = color_map(world, params=(static_variable_selection, static_domain_selection, no_inference))
draw_color_map(world, solution)
Seems like only three colors are needed for counties of Connecticut!
This is a function for the map of Ireland, also reused from the discussion board.
In [16]:
def ireland_map():
nodes = ["Carlow", "Cavan", "Clare", "Cork", "Donegal", "Dublin", "Galway", "Kerry", "Kildare", "Kilkenny",
"Laois", "Leitrim", "Limerick", "Longford", "Louth", "Mayo", "Meath", "Monaghan", "Offaly",
"Roscommon", "Sligo", "Tipperary", "Waterford", "Westmeath", "Wexford", "Wicklow"]
edges = [("Carlow", "Wexford"),("Carlow", "Kilkenny"),("Carlow", "Laois"),("Carlow", "Kildare"),("Carlow", "Wicklow"),
("Cavan", "Monaghan"),("Cavan", "Meath"),("Cavan", "Westmeath"),("Cavan", "Longford"),("Cavan", "Leitrim"),
("Clare", "Limerick"),("Clare", "Tipperary"),("Clare", "Galway"),
("Cork", "Kerry"),("Cork", "Limerick"),("Cork", "Tipperary"),("Cork", "Waterford"),
("Donegal", "Leitrim"),
("Dublin", "Wicklow"),("Dublin", "Kildare"),("Dublin", "Meath"),
("Galway", "Tipperary"),("Galway", "Offaly"),("Galway", "Roscommon"),("Galway", "Mayo"),
("Kerry", "Limerick"),
("Kildare", "Wicklow"),("Kildare", "Laois"),("Kildare", "Offaly"),("Kildare", "Meath"),
("Kilkenny", "Waterford"),("Kilkenny", "Tipperary"),("Kilkenny", "Laois"),("Kilkenny", "Wexford"),
("Laois", "Tipperary"),("Laois", "Offaly"),
("Leitrim", "Longford"),("Leitrim", "Roscommon"),("Leitrim", "Sligo"),
("Limerick", "Tipperary"),
("Longford", "Westmeath"),("Longford", "Roscommon"),
("Louth", "Meath"),("Louth", "Monaghan"),
("Mayo", "Roscommon"),("Mayo", "Sligo"),
("Meath", "Offaly"),("Meath", "Westmeath"),
("Offaly", "Tipperary"),("Offaly", "Roscommon"),("Offaly", "Westmeath"),
("Roscommon", "Sligo"),("Roscommon", "Westmeath"),
("Tipperary", "Waterford"),
("Wexford", "Wicklow")]
return (nodes, edges)
In [17]:
world = ireland_map()
solution = color_map(world, params=(static_variable_selection, static_domain_selection, no_inference), debug=False)
draw_color_map(world, solution)
This is the map of Europe, reused from the discussion board and slightly modified.
In [18]:
def europe_map():
nodes = ["Iceland","Ireland","United Kingdom","Portugal","Spain","France","Andorra","Monaco","Belgium",
"Luxembourg","Germany","Switzerland","Italy","Netherlands","Liechtenstein","Austria","Malta","Cyprus",
"Vatican City","San Marino","Slovenia","Denmark","Poland","Czech Republic","Norway","Sweden","Finland",
"Russia","Slovakia","Hungary","Croatia","Lithuania","Belarus","Ukraine","Estonia","Latvia","Georgia",
"Azerbaijan","Armenia","Turkey","Moldova","Romania","Bosnia & Herzegovina","Serbia","Bulgaria",
"Montenegro","Kosovo","Macedonia","Albania","Greece"]
edges = [("Iceland","United Kingdom"),("Iceland","Ireland"),("Iceland","Norway"),
("Norway","Denmark"),("Norway","United Kingdom"),("Norway","Sweden"),("Norway","Netherlands"),
("Sweden","Finland"),("Sweden","Denmark"),("Sweden","Estonia"),("Sweden","Latvia"),("Sweden","Lithuania"),("Sweden","Poland"),
("Finland","Estonia"),("Finland","Russia"),
("Estonia","Russia"),("Estonia","Latvia"),
("Latvia","Lithuania"),("Latvia","Belarus"),("Latvia","Russia"),
("Lithuania","Belarus"),("Lithuania","Poland"),
("United Kingdom","Ireland"),("United Kingdom","France"),("United Kingdom","Belgium"),("United Kingdom","Netherlands"),
("Denmark","United Kingdom"),("Denmark","Germany"),("Denmark","Poland"),("Denmark","Netherlands"),
("Portugal","Iceland"),("Portugal","Spain"),
("Spain","Andorra"),("Spain","France"),("Spain","Ireland"),("Spain","United Kingdom"),("Spain","Monaco"),("Spain","Italy"),
("France","Belgium"),("France","Germany"),("France","Switzerland"),("France","Italy"),("France","Monaco"),("France","Luxembourg"),
("Belgium","Netherlands"),("Belgium","Luxembourg"),("Belgium","Germany"),
("Germany","Luxembourg"),("Germany","Switzerland"),("Germany","Austria"),("Germany","Poland"),("Germany","Czech Republic"),
("Poland","Czech Republic"),("Poland","Slovakia"),("Poland","Belarus"),("Poland","Ukraine"),
("Belarus","Ukraine"),("Belarus","Russia"),
("Switzerland","Italy"),("Switzerland","Austria"),("Switzerland","Liechtenstein"),
("Liechtenstein","Austria"),
("Austria","Czech Republic"),("Austria","Slovakia"),("Austria","Hungary"),("Austria","Slovenia"),("Austria","Italy"),
("Slovakia","Czech Republic"),("Slovakia","Hungary"),("Slovakia","Ukraine"),
("Ukraine","Russia"),("Ukraine","Moldova"),("Ukraine","Turkey"),
("Italy","Monaco"),("Italy","Vatican City"),("Italy","Slovenia"),("Italy","Croatia"),("Italy","Malta"),("Italy","San Marino"),
("Slovenia","Croatia"),("Slovenia","Hungary"),
("Hungary","Romania"),("Hungary","Serbia"),
("Romania","Serbia"),("Romania","Bulgaria"),("Romania","Moldova"),("Romania","Turkey"),
("Moldova","Ukraine"),("Croatia","Bosnia & Herzegovina"),("Croatia","Serbia"),("Croatia","San Marino"),
("Bosnia & Herzegovina","Montenegro"),("Bosnia & Herzegovina","Serbia"),("Bosnia & Herzegovina","San Marino"),
("Montenegro","Serbia"),("Montenegro","Kosovo"),("Montenegro","Albania"),
("Kosovo","Serbia"),("Kosovo","Albania"),("Kosovo","Macedonia"),("Serbia","Bulgaria"),("Serbia","Romania"),
("Albania","Greece"),("Albania","Italy"),("Albania","Macedonia"),
("Macedonia","Greece"),("Macedonia","Serbia"),("Macedonia","Bulgaria"),
("Bulgaria","Romania"),("Bulgaria","Turkey"),("Bulgaria","Greece"),("Romania","Ukraine"),
("Greece","Turkey"),("Greece","Malta"),("Greece","Cyprus"),
("Turkey","Cyprus"),("Turkey","Ukraine"),("Turkey","Georgia"),("Turkey","Armenia"),
("Armenia","Georgia"),("Armenia","Azerbaijan"),("Georgia","Azerbaijan")]
return (nodes, edges)
In [19]:
world = europe_map()
solution = color_map(world, params=(static_variable_selection, static_domain_selection, no_inference), debug=False)
draw_color_map(world, solution)
Forward Checking
Our forward-checking algorithm shall simply delete all values from neighboring nodes' domains.
In [20]:
def forward_checking(csp, variable, value, assignment, debug=True):
unassigned_neighbors = [y for y in csp[2][variable] if y not in assignment]
for x in unassigned_neighbors:
if value in csp[1][x]:
if debug: print " Removing {0} from {1}".format(value, x)
csp[1][x] = [c for c in csp[1][x] if c is not value]
if len(csp[1][x]) == 0: return False
return True
In [21]:
world = connecticut_map()
solution = color_map(world, params=(static_variable_selection, static_domain_selection, forward_checking))
draw_color_map(world, solution)
In [22]:
world = ireland_map()
solution = color_map(world, params=(static_variable_selection, static_domain_selection, forward_checking), debug=False)
draw_color_map(world, solution)
In [23]:
world = europe_map()
solution = color_map(world, params=(static_variable_selection, static_domain_selection, forward_checking), debug=False)
draw_color_map(world, solution)
Minimum Remaining Value
This heuristic, also known as the most constrained variable, is intended to pick a variable that is most likely to fail first, thereby speeding up the backtracking process.
In [24]:
def minimum_remaining_values(solution, csp, debug=True):
unassigned_variables = [x for x in csp[0] if x not in solution]
variable_domain_count = [(x, len(csp[1][x])) for x in unassigned_variables]
variables = sorted(variable_domain_count, key=itemgetter(1))
if debug: print " Selected variable {0} with value count {1}".format(variables[0][0], variables[0][1])
return variables[0][0]
In [25]:
world = connecticut_map()
solution = color_map(world, params=(minimum_remaining_values, static_domain_selection, forward_checking))
draw_color_map(world, solution)
In [26]:
world = ireland_map()
solution = color_map(world, params=(minimum_remaining_values, static_domain_selection, forward_checking), debug=False)
draw_color_map(world, solution)
In [27]:
world = europe_map()
solution = color_map(world, params=(minimum_remaining_values, static_domain_selection, forward_checking), debug=False)
draw_color_map(world, solution)
Least Constraining Value
This heuristic will select the value that is most common amongst neighbor nodes.
In [28]:
def least_contraining_value(variable, assignment, csp, debug=True):
neighbors = csp[2][variable]
value_count = []
for value in csp[1][variable]:
count = 0
for neighbor in neighbors:
if value in csp[1][neighbor]:
count = count + 1
value_count.append((value, count))
counted = sorted(value_count, key=itemgetter(1), reverse=True)
for value, count in counted:
if debug: print " Selected value {0} for {1}".format(value, variable)
yield value
In [29]:
world = connecticut_map()
solution = color_map(world, params=(minimum_remaining_values, least_contraining_value, forward_checking))
draw_color_map(world, solution)
In [38]:
world = ireland_map()
solution = color_map(world, params=(minimum_remaining_values, least_contraining_value, forward_checking), debug=False)
draw_color_map(world, solution)
In [31]:
world = europe_map()
solution = color_map(world, params=(minimum_remaining_values, least_contraining_value, forward_checking), debug=False)
draw_color_map(world, solution)
Degree Heuristic
This is an alternative to Minimum Remaining Values as a heuristic to pick a variable. This will pick the variable with the highest number of constraints.
In [32]:
def degree_heuristic(solution, csp, debug=True):
unassigned_variables = [x for x in csp[0] if x not in solution]
variable_domain_count = [(x, len(csp[2][x])) for x in unassigned_variables]
variables = sorted(variable_domain_count, key=itemgetter(1), reverse=True)
if debug: print " Selected variable {0} with constraint count {1}".format(variables[0][0], variables[0][1])
return variables[0][0]
In [33]:
world = connecticut_map()
solution = color_map(world, params=(degree_heuristic, least_contraining_value, forward_checking),)
draw_color_map(world, solution)
In [34]:
world = ireland_map()
solution = color_map(world, params=(degree_heuristic, least_contraining_value, forward_checking), debug=False)
draw_color_map(world, solution)
In [35]:
world = europe_map()
solution = color_map(world, params=(degree_heuristic, least_contraining_value, forward_checking), debug=False)
draw_color_map(world, solution)
Random Graphs
We can use NetworkX to generate random graphs.
In [36]:
G = nx.gnp_random_graph(25, 0.25)
world = ((G.nodes(),G.edges()))
solution = color_map(world, params=(minimum_remaining_values, least_contraining_value, forward_checking), debug=False)
draw_color_map(world, solution)
Summary
The Backtracking-Search algorithm is a simple informed depth-first search algorithm that assigns values to variables given certain constraints. Once it determines that a value cannot be assigned, it backtracks to the previous assignment and tries a different value. This is similar to traversing down a tree until all nodes have values assigned to them that satisfies the given constraints.
This has probably been my favorite algorithm to work with so far, although implementing the backtracking algorithm was a bit harder than I expected. This is probably because of translation errors from psuedocode to actual python code. The psuedocode for the algorithm returns False or Assignments, which in python are two different objects (boolean and dictionary). Once I made the necessary adjustments the algorithm seemed to work like I expected. Another python frustration which I found out the hard way: "is not" =/= "!=". This was especially true when I was converting nodes and edges into an adjacency list and the map of europe was coming up with a near empty adjacency list.
Perhaps one part of CSP that isn't completely clear is the difference between AC-3, Forward Checking and MAC. I believe that AC-3 checks arc-consistency given the current variable and value assignment. Forward Checking removes the current value assignment from unassigned variables which prevents conflicts in future assignments. I think that MAC is a combination of both; it does Forward Checking by removing values from unassigned variables as well as propagates those changes forward like AC-3.
I did have problems with NetworkX and Matplotlib not wanting to plot images. Specifically, some of the images turn out to be incredibly small. If I run the cell again the problem seems to fix itself but it is puzzling why it doesn't work the first time around.
Finally, it seems that it is possible to generate a random graph that cannot be solved by this algorithm, probably because it is too densely connected. This makes me question my algorithm but upon reading the Four Color Theorem, it seems that the requirement be that nodes be adjacent and not at corners, which may be why some graphs cannot be solved with four colors.
In [37]:
# leave this cell for my testing.